We thank the reviewers for their feedback and comments. In what follows, we address the specific comments made by the reviewers.
Reviewer 2
I do not completely understand (apart from some parts of the proofs) why refer to these functions as Graph-based.
Boolean k-ary functions may be thought of as hypergraphs. The definition should not be unusual, and it will be clarified to avoid any possible ...
This is completely analogous to the standard empirical distribution for hypothesis classes.
It might be helpful to summarise, ..., some basic properties of this new notion of VC dimension... ..., is there a Sauer-Shelah type upper bound on the size of the class in terms of the graph VC dimension?
... VC dimension entail small graph VC dimension). ... Shelah Lemma for graph VC dimension; indeed, this is noteworthy and we should discuss this in the main text.
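For reference, the classical Sauer-Shelah lemma that the question above alludes to can be stated as follows. This is the standard statement for ordinary VC dimension, not the graph VC dimension analogue; the symbols $\mathcal{H}$, $X$, $d$, $m$, and $S$ are notation assumed here for illustration only.

```latex
% Classical Sauer-Shelah lemma (standard VC dimension, not the graph variant).
% Assumed notation: \mathcal{H} is a class of Boolean functions on a domain X,
% d = \mathrm{VC}(\mathcal{H}), and S \subseteq X is any set with |S| = m.
\[
  \bigl|\{\, h|_S : h \in \mathcal{H} \,\}\bigr|
  \;\le\; \sum_{i=0}^{d} \binom{m}{i}
  \;\le\; \Bigl(\frac{e\,m}{d}\Bigr)^{d}
  \qquad \text{for } m \ge d \ge 1 .
\]
```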
Mutual Concerns of All Reviewers
" The results around Figure 1 are difficult to understand; Some figure and table captions and some mathematical derivations We will rewrite the corresponding sections and fix the issues you have pointed out. Thank you all very much! " What factors are critical for the performance of the proposed architectures? Empirically, such "scalability" is observed for that larger instances of the 2 architectures yield better performance. Larger difference with more layers is expected on more complex tasks.